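In this work, we propose a novel architecture for face age editing that can produce structural modifications while preserving relevant details present in the original image. We disentangle the style and content of the input image and propose a new decoder network that adopts a style-based strategy to combine the style and content representations of the input image while conditioning the output on the target age. We go beyond existing aging methods by allowing users to adjust the degree of structure preservation in the input image at inference time. To this end, we introduce a masking mechanism, the Custom Structure Preservation (CUSP) module, which distinguishes relevant regions in the input image from those that should be discarded. CUSP requires no additional supervision. Finally, our quantitative and qualitative analyses, including a user study, show that our method outperforms prior art and demonstrate the effectiveness of our strategy for image editing and adjustable structure preservation. Code and pretrained models are available at https://github.com/guillermogogotre/cusp.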
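3D point cloud semantic segmentation is fundamental for autonomous driving. Most approaches in the literature neglect an important aspect, namely how to deal with domain shift when handling dynamic scenes. This can significantly hinder the navigation capabilities of self-driving vehicles. This paper advances the state of the art in this research field. Our first contribution is the analysis of a new, unexplored scenario in point cloud segmentation, namely Source-Free Online Unsupervised Domain Adaptation (SF-OUDA). We show experimentally that state-of-the-art methods have a rather limited ability to adapt pre-trained deep network models to unseen domains in an online manner. Our second contribution is an approach that relies on adaptive self-training and geometric propagation to adapt a pre-trained source model online without requiring either source data or target labels. Our third contribution is to study SF-OUDA in a challenging setup where the source data is synthetic and the target data consists of point clouds captured in the real world. We use the recent SynLiDAR dataset as a synthetic source and introduce two new synthetic (source) datasets, which can stimulate future synthetic-to-real autonomous driving research. Our experiments show the effectiveness of our segmentation approach on thousands of real-world point clouds. Code and synthetic datasets are available at https://github.com/saltoricristiano/gipso-sfouda.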
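To make the self-training idea in the abstract above more concrete, here is a minimal sketch of confidence-based pseudo-label selection, a common building block of self-training. It is a generic illustration and not the authors' adaptive scheme: the function name `select_pseudo_labels`, the fixed `threshold`, and the `IGNORE` marker are all assumptions introduced here.

```python
import numpy as np

IGNORE = -1  # hypothetical marker for points excluded from the self-training loss

def select_pseudo_labels(probs, threshold=0.9):
    """Keep the argmax class for confident points, mark the rest as IGNORE.

    probs: array of shape (num_points, num_classes) with per-point class
    probabilities predicted by the pre-trained source model.
    """
    probs = np.asarray(probs, dtype=float)
    labels = probs.argmax(axis=1)                # most likely class per point
    confident = probs.max(axis=1) >= threshold   # fixed threshold (assumption)
    return np.where(confident, labels, IGNORE)
```

In an online setting, a function like this would be applied to each incoming point cloud, and only the confident points would contribute to the update of the model.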
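Continual Learning (CL) aims to develop agents that emulate the human ability to learn new tasks sequentially while retaining knowledge obtained from past experiences. In this paper, we introduce the novel problem of Memory-Constrained Online Continual Learning (MC-OCL), which imposes strict constraints on the memory overhead that an algorithm may use to avoid catastrophic forgetting. As most, if not all, previous CL methods violate these constraints, we propose an algorithmic solution to MC-OCL: Batch-level Distillation (BLD), a regularization-based CL approach that effectively balances stability and plasticity in order to learn from data streams while preserving the ability to solve old tasks through distillation. Our extensive experimental evaluation, conducted on three publicly available benchmarks, empirically demonstrates that our approach successfully addresses the MC-OCL problem and achieves accuracy comparable to prior distillation methods that require higher memory overhead.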
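As a rough illustration of distillation used as a regularizer, the sketch below adds a distillation term to the cross-entropy on the current batch, so that new predictions stay close to previously recorded ones. This is a generic knowledge-distillation loss, not the BLD procedure itself; the function names, the temperature `T`, and the weight `lam` are assumptions made for this example.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Row-wise softmax with temperature T (numerically stabilized)."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels):
    """Mean cross-entropy of the current predictions against integer labels."""
    p = softmax(logits)
    return float(-np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12)))

def distillation_term(new_logits, old_logits, T=2.0):
    """Mean KL divergence from the old (teacher) to the new (student) predictions."""
    p_old = softmax(old_logits, T)
    p_new = softmax(new_logits, T)
    kl = np.sum(p_old * (np.log(p_old + 1e-12) - np.log(p_new + 1e-12)), axis=1)
    return float(np.mean(kl))

def regularized_loss(new_logits, old_logits, labels, lam=1.0, T=2.0):
    """Plasticity (cross-entropy) plus lam times stability (distillation)."""
    return cross_entropy(new_logits, labels) + lam * distillation_term(new_logits, old_logits, T)
```

The weight `lam` controls the trade-off: a larger value favors stability on old tasks, a smaller value favors plasticity on the incoming data.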
Image animation consists of generating a video sequence so that an object in a source image is animated according to the motion of a driving video. Our framework addresses this problem without using any annotation or prior information about the specific object to animate. Once trained on a set of videos depicting objects of the same category (e.g. faces, human bodies), our method can be applied to any object of this class. To achieve this, we decouple appearance and motion information using a self-supervised formulation. To support complex motions, we use a representation consisting of a set of learned keypoints along with their local affine transformations. A generator network models occlusions arising during target motions and combines the appearance extracted from the source image and the motion derived from the driving video. Our framework scores best on diverse benchmarks and on a variety of object categories. Our source code is publicly available.
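The keypoint-plus-local-affine representation can be sketched as a first-order approximation of motion around each keypoint. The snippet below is one plausible reading of that idea, not the authors' implementation: the function name `local_affine_map`, its argument names, and the convention of mapping driving-frame coordinates back to source-frame coordinates are assumptions.

```python
import numpy as np

def local_affine_map(z, kp_driving, kp_source, jacobian):
    """Map points near a driving-frame keypoint to source-frame coordinates.

    z:          (N, 2) points in the driving frame
    kp_driving: (2,) keypoint location in the driving frame
    kp_source:  (2,) location of the same keypoint in the source frame
    jacobian:   (2, 2) local affine (first-order) term around the keypoint
    """
    z = np.asarray(z, dtype=float)
    offsets = z - np.asarray(kp_driving, dtype=float)
    # Apply the 2x2 Jacobian to each row vector, then shift to the source keypoint.
    return np.asarray(kp_source, dtype=float) + offsets @ np.asarray(jacobian, dtype=float).T
```

With an identity Jacobian the map reduces to a pure translation between the two keypoint locations; a non-trivial Jacobian lets the neighborhood of the keypoint rotate, scale, or shear.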
We explore the abilities of two machine learning approaches for no-arbitrage interpolation of European vanilla option prices, which jointly yield the corresponding local volatility surface: a finite dimensional Gaussian process (GP) regression approach under no-arbitrage constraints based on prices, and a neural net (NN) approach with penalization of arbitrages based on implied volatilities. We demonstrate the performance of these approaches relative to the SSVI industry standard. The GP approach is proven arbitrage-free, whereas arbitrages are only penalized under the SSVI and NN approaches. The GP approach obtains the best out-of-sample calibration error and provides uncertainty quantification. The NN approach yields a smoother local volatility and a better backtesting performance, as its training criterion incorporates a local volatility regularization term.
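For readers unfamiliar with what "arbitrage-free" means for a grid of call prices, the sketch below checks the standard static conditions on a discrete grid: prices non-decreasing in maturity (no calendar-spread arbitrage), and non-increasing and convex in strike (no butterfly arbitrage). It is an illustrative check, not the GP or NN method from the abstract, and it assumes zero interest rates and dividends; the helper names `bs_call` and `is_arbitrage_free` are introduced here.

```python
import math
import numpy as np

def bs_call(S, K, T, sigma):
    """Black-Scholes call price, assuming zero rates and dividends."""
    d1 = (math.log(S / K) + 0.5 * sigma**2 * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return S * cdf(d1) - K * cdf(d2)

def is_arbitrage_free(C, strikes, tol=1e-10):
    """Check static no-arbitrage conditions on a call price grid.

    C[i, j] is the price for maturity i and strike j, with both the
    maturities and the strikes sorted in increasing order.
    """
    C = np.asarray(C, dtype=float)
    strikes = np.asarray(strikes, dtype=float)
    # Calendar spread: prices must not decrease with maturity (zero rates).
    calendar = np.all(np.diff(C, axis=0) >= -tol)
    # Strike slopes must lie in [-1, 0] ...
    slopes = np.diff(C, axis=1) / np.diff(strikes)
    monotone = np.all((slopes <= tol) & (slopes >= -1.0 - tol))
    # ... and be non-decreasing, i.e. prices are convex in strike (butterfly).
    convex = np.all(np.diff(slopes, axis=1) >= -tol)
    return bool(calendar and monotone and convex)
```

A penalization-based approach would turn violations of these inequalities into loss terms, whereas a constrained approach enforces them exactly.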
Purpose: The purpose of this paper is to present a method for real-time 2D-3D non-rigid registration using a single fluoroscopic image. Such a method can find applications in surgery, interventional radiology and radiotherapy. By estimating a three-dimensional displacement field from a 2D X-ray image, anatomical structures segmented in the preoperative scan can be projected onto the 2D image, thus providing a mixed reality view. Methods: A dataset composed of displacement fields and 2D projections of the anatomy is generated from the preoperative scan. From this dataset, a neural network is trained to recover the unknown 3D displacement field from a single projection image. Results: Our method is validated on lung 4D CT data at different stages of the lung deformation. The training is performed on a 3D CT using random (non domain-specific) diffeomorphic deformations, to which perturbations mimicking the pose uncertainty are added. The model achieves a mean TRE over a series of landmarks ranging from 2.3 to 5.5 mm depending on the amplitude of deformation. Conclusion: In this paper, a CNN-based method for real-time 2D-3D non-rigid registration is presented. This method is able to cope with pose estimation uncertainties, making it applicable to actual clinical scenarios, such as lung surgery, where the C-arm pose is planned before the intervention.
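The mean target registration error (TRE) reported above is conventionally the average Euclidean distance between landmark positions after registration and their reference positions. A minimal sketch of that computation follows; the function name `mean_tre` and the assumption that landmarks are given as coordinate arrays in millimetres are introduced here for illustration.

```python
import numpy as np

def mean_tre(predicted, reference):
    """Mean Euclidean distance between corresponding landmarks.

    predicted, reference: arrays of shape (num_landmarks, 3), assumed to be
    in the same coordinate frame and unit (e.g. millimetres).
    """
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.linalg.norm(predicted - reference, axis=1).mean())
```

In a displacement-field setting, `predicted` would typically be the preoperative landmark positions plus the estimated displacement sampled at each landmark.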
Although deep networks have shown vulnerability to evasion attacks, such attacks usually have unrealistic requirements. Recent literature has discussed whether some of these requirements can be removed. This paper contributes to this literature by introducing a carpet-bombing patch attack that has almost no requirements. Because it targets the feature representations, this patch attack does not require knowledge of the network task. The attack decreases accuracy on ImageNet, mAP on Pascal VOC, and IoU on Cityscapes without being aware that the underlying tasks involve classification, detection, and semantic segmentation, respectively. Beyond the potential safety issues raised by this attack, the impact of the carpet-bombing attack highlights some interesting properties of deep network layer dynamics.
Computer vision and machine learning are playing an increasingly important role in computer-assisted diagnosis; however, the application of deep learning to medical imaging has challenges in data availability and data imbalance, and it is especially important that models for medical imaging are built to be trustworthy. Therefore, we propose TRUDLMIA, a trustworthy deep learning framework for medical image analysis, which adopts a modular design, leverages self-supervised pre-training, and utilizes a novel surrogate loss function. Experimental evaluations indicate that models generated from the framework are both trustworthy and high-performing. It is anticipated that the framework will support researchers and clinicians in advancing the use of deep learning for dealing with public health crises including COVID-19.
Deep learning surrogate models are increasingly used to accelerate scientific simulations as a replacement for costly conventional numerical techniques. However, their use remains a significant challenge when dealing with real-world complex examples. In this work, we demonstrate three types of neural network architectures for efficient learning of highly non-linear deformations of solid bodies. The first two architectures are based on the recently proposed CNN U-NET and MAgNET (graph U-NET) frameworks, which have shown promising performance for learning on mesh-based data. The third architecture is Perceiver IO, a very recent architecture that belongs to the family of attention-based neural networks, a class that has revolutionised diverse engineering fields but is still unexplored in computational mechanics. We study and compare the performance of all three networks on two benchmark examples, and show their capabilities to accurately predict the non-linear mechanical responses of soft bodies.
In Novel Class Discovery (NCD), the goal is to find new classes in an unlabeled set given a labeled set of known but different classes. While NCD has recently gained attention from the community, no framework has yet been proposed for heterogeneous tabular data, even though it is a very common data representation. In this paper, we propose TabularNCD, a new method for discovering novel classes in tabular data. We show a way to extract knowledge from already known classes to guide the discovery of novel classes in tabular data that contains heterogeneous variables. Part of this process relies on a new method for defining pseudo labels, and we follow recent findings in Multi-Task Learning to optimize a joint objective function. Our method demonstrates that NCD is applicable not only to images but also to heterogeneous tabular data.